Vehicle Detection and Tracking Project

The main steps of this project are to:

  1. Train a classifier to differentiate between cars and non-cars
  2. Use the classifier to detect and track cars in a video

Each step will be done as follows:

Train a classifier

  1. Select features to extract from the training set
  2. Prepare the data
  3. Train a classifier
  4. Test the classifier

Once a classifier has been sufficiently trained and the overall accuracy is acceptable, it can be used without retraining.

Vehicle detection

  1. Use sliding windows of various sizes to search for positive matches
  2. Convert the detected matches into a heatmap to combine multiple detections
  3. Threshold the heatmap to remove false positives
  4. Draw a bounding box over the original frame of the video using the heatmap to determine location and size
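Steps 2 and 3 above can be sketched with plain NumPy; the box coordinates and toy grid size below are illustrative only:

```python
import numpy as np

def add_heat(heatmap, bboxes):
    # each box ((x_min, y_min), (x_max, y_max)) votes for the pixels it covers
    for (x_min, y_min), (x_max, y_max) in bboxes:
        heatmap[y_min:y_max, x_min:x_max] += 1
    return heatmap

def apply_threshold(heatmap, threshold):
    # pixels with too few overlapping detections are treated as false positives
    heatmap[heatmap <= threshold] = 0
    return heatmap

heat = np.zeros((10, 10), dtype=int)
heat = add_heat(heat, [((1, 1), (6, 6)), ((3, 3), (8, 8))])
heat = apply_threshold(heat, 1)
# only the 3x3 overlap of the two boxes survives the threshold
```

With a threshold of 1, any pixel covered by a single window is discarded, so isolated false positives vanish while overlapping detections of a real car remain.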

Imports

In [1]:
import cv2
import glob
import itertools
import numpy as np
import math
import matplotlib.pyplot as plt
import pickle
import random

from scipy.ndimage.measurements import label
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from skimage.feature import hog

try:
    # scikit-learn version >= 0.18
    from sklearn.model_selection import train_test_split
except ImportError:
    # scikit-learn version <= 0.17
    from sklearn.cross_validation import train_test_split

# visualizations will be shown in the notebook
%matplotlib inline

Train a classifier

0. Read in the datasets

In [2]:
def read_data():
    
    """Read in the data to train and validate the classifier
    
    Returns:
        A tuple of lists for images of vehicles and images of non-vehicles"""
    
    # Used for testing purposes while fleshing out the pipeline
    VEHICLES_SMALL = 'data/vehicles_smallset/*/*.jpeg'
    NON_VEHICLES_SMALL = 'data/non-vehicles_smallset/*/*.jpeg'

    # Used for training the final model in some form or another
    VEHICLES_FULL = 'data/vehicles/*/*.png'
    NON_VEHICLES_FULL = 'data/non-vehicles/*/*.png'
    
    vehicles = glob.glob(VEHICLES_FULL)
    non_vehicles = glob.glob(NON_VEHICLES_FULL)

    return (vehicles, non_vehicles)
In [3]:
cars, notcars = read_data()

print('cars: {}, notcars: {}'.format(len(cars), len(notcars)))
cars: 8792, notcars: 8968

Visualize the data

In [4]:
def create_collage(filenames):
    
    """Create a collage of the images
    
    Parameters:
        • filenames - a list of image filenames
        
    Returns:
        A collage image"""
    
    cols = math.ceil(math.sqrt(len(filenames)))
    rows = math.ceil(len(filenames) / cols)

    w, h, d = (64, 64, 3)

    collage = np.zeros((cols * w, rows * h, d), dtype='uint8')

    col = row = 0

    for i, filename in enumerate(filenames):
        img = cv2.imread(filename)        
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    
        x_pos = col * w
        y_pos = row * h
        collage[x_pos:x_pos + w, y_pos:y_pos + h, :] = img
    
        col += 1
        if col >= cols:
            col = 0
            row += 1
            
    return collage
In [5]:
for data_set, title in zip((cars, notcars), ('Cars', 'Non-cars')):
    collage = create_collage(random.sample(data_set, min(len(data_set), 9)))
    plt.figure()
    plt.title(title)
    plt.imshow(collage);

1. Feature selection

Select the features on which to train the classifier

In [6]:
def convert_color(img, to_color_space='RGB', from_color_space='BGR'):

    """Helper function to convert an image from one color space to another.
    The assumption is that the image was read in using OpenCV, hence the
    'BGR' color space default.
    
    Parameters:
        • img - input image to convert
        • to_color_space - desired color space (default: 'RGB')
        • from_color_space - input color space (default: 'BGR')
        
    Returns:
        An image in the new color space or the original image if there was an error"""
    
    if to_color_space == from_color_space:
        converted_img = np.copy(img)
    else:
        try:
            # get the conversion identifier to use
            conversion = getattr(cv2, 'COLOR_{}2{}'.format(from_color_space, to_color_space))
        except AttributeError as e:
            print("ERROR: {}".format(e.args))
            return img

        # convert image to new color space (if specified)
        converted_img = cv2.cvtColor(img, conversion)

    return converted_img
In [7]:
def CLAHE(img, from_color_space='BGR', tile_size=4):

    """Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the
    luminance channel of the image.
    See: http://docs.opencv.org/3.1.0/d5/daf/tutorial_py_histogram_equalization.html

    Parameters:
        • img - input image
        • from_color_space - input color space (default: 'BGR')
        • tile_size - tile grid size for the CLAHE algorithm

    Returns:
        The equalized luminance channel as a single-channel image"""

    if len(img.shape) > 2 and img.shape[2] > 1:
        y = convert_color(img, 'YUV', from_color_space)[:, :, 0]
    elif len(img.shape) > 2:
        y = img[:, :, 0]
    else:
        y = img

    clahe = cv2.createCLAHE(tileGridSize=(tile_size, tile_size))
    return clahe.apply(y)
In [8]:
def bin_spatial(img, size=(32, 32)):

    """Extract the spatial binned color features
    
    Parameters:
        • img - input image
        • size - reduced size of image to use as features
        
    Returns:
        A list of color features based on the resized image"""

    # use cv2.resize().ravel() to create the feature vector
    features = cv2.resize(img, size).ravel()

    # Return the feature vector
    return features
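The size of the spatial feature vector follows directly from the resize target. A quick NumPy-only sketch (a nearest-neighbour downsample stands in for `cv2.resize`, purely for illustration):

```python
import numpy as np

# toy 64x64 RGB image; the pixel values are arbitrary
img = (np.arange(64 * 64 * 3) % 256).astype(np.uint8).reshape(64, 64, 3)

# nearest-neighbour downsample as a stand-in for cv2.resize
size = (32, 32)
step = img.shape[0] // size[0]
small = img[::step, ::step, :]

features = small.ravel()
# a (32, 32) spatial bin of a 3-channel image yields 32 * 32 * 3 = 3072 features
```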
In [9]:
def color_hist(img, nbins=32, bins_range=(0, 256)):

    """Calculate a histogram for each color channel in the image and create a list of features from them.
    
    Parameters:
        • img - input image
        • nbins - number of bins in the histogram
        • bins_range - lower and upper range of the bins (above and below are ignored)
        
    Returns:
        A list of color histogram features for the image"""
    
    # Compute the histogram of the color channels separately
    if len(img.shape) > 2:
        hist_features = np.concatenate([np.histogram(img[:, :, c], bins=nbins, range=bins_range)[0] for c in range(img.shape[-1])])
    else:
        hist_features = np.histogram(img, bins=nbins, range=bins_range)[0]
        
    # return the individual histograms, bin_centers and feature vector
    return hist_features
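The histogram feature count is simply `nbins` per channel, and each channel's counts sum to the pixel count. A small check mirroring the per-channel concatenation above:

```python
import numpy as np

img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
nbins = 32

# one histogram per channel, concatenated exactly as color_hist does
feats = np.concatenate([np.histogram(img[:, :, c], bins=nbins, range=(0, 256))[0]
                        for c in range(3)])
# 3 channels x 32 bins = 96 features; each channel's counts sum to 64 * 64 pixels
```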
In [10]:
def get_hog_features(img, orient, pix_per_cell, cell_per_block, vis=False, feature_vec=True):
    
    """Extract the Histogram of Oriented Gradient (HOG) features for the image.
    
    Parameters:
        • img - input image
        • orient - number of orientations for HOG features
        • pix_per_cell - cell size over which each gradient histogram is computed
        • cell_per_block - specifies the local area over which the histogram counts in a given cell will be normalized
        • vis - boolean to enable a visualization of the HOG
        • feature_vec - boolean to return the data as a feature vector
        
    Returns:
        The HOG features will be returned either multidimensional or as a feature vector depending on `feature_vec`.
        If `vis` is true, an image representation of the HOG is also returned."""
    
    # Call with two outputs if vis==True
    if vis == True:
        features, hog_image = hog(img, orientations=orient, pixels_per_cell=(pix_per_cell, pix_per_cell),
                                  cells_per_block=(cell_per_block, cell_per_block), transform_sqrt=True,
                                  visualise=vis, feature_vector=feature_vec)
        return features, hog_image
    # Otherwise call with one output
    else:
        features = hog(img, orientations=orient, pixels_per_cell=(pix_per_cell, pix_per_cell),
                       cells_per_block=(cell_per_block, cell_per_block), transform_sqrt=True,
                       visualise=vis, feature_vector=feature_vec)
        return features
In [11]:
def extract_features(imgs, color_space='RGB', spatial_size=(32, 32),
                        hist_bins=32, orient=9,
                        pix_per_cell=8, cell_per_block=2, hog_channel=0, hog_clahe=True,
                        spatial_feat=True, hist_feat=True, hog_feat=True):
    
    """Extract features from the input images based on the parameters passed in.
    
    Parameters:
        • imgs - a list of input images
        • color_space - desired color space to extract features
        • spatial_size - size for spatial binning of color features
        • hist_bins - number of bins for the color histogram features
        • orient - number of orientations for HOG features
        • pix_per_cell - cell size over which each gradient histogram is computed
        • cell_per_block - specifies the local area over which the histogram counts in a given cell will be normalized
        • hog_channel - image channel to apply the Histogram of Oriented Gradient (HOG)
        • hog_clahe - boolean to use the Contrast Limited Adaptive Histogram Equalization for the HOG input
        • spatial_feat - boolean to enable spatial binning of color features
        • hist_feat - boolean to enable color histogram features
        • hog_feat - boolean to enable HOG features
        
    Returns:
        A list of features per image"""
    
    # create a list to append feature vectors to
    features = []
    
    # iterate through the list of images
    for file in imgs:

        # features for this single file
        file_features = []

        # read in each one by one
        img = cv2.imread(file)

        # apply color conversion if other than 'RGB'
        feature_img = convert_color(img, color_space)

        # normalize the image
        #feature_img = feature_img.astype(np.float32)/255
        
        # extract spatial binning of color features, if enabled
        if spatial_feat:
            spatial_features = bin_spatial(feature_img, size=spatial_size)
            file_features.append(spatial_features)
            
        # extract color histogram features, if enabled
        if hist_feat:
            hist_features = color_hist(feature_img, nbins=hist_bins)
            file_features.append(hist_features)
            
        if hog_feat:
            if hog_clahe:
                feature_img[:, :, 0] = CLAHE(feature_img)
                hog_channel = hog_channel if hog_channel == 'ALL' else 0
            if hog_channel == 'ALL':
                hog_features = []
                for channel in range(feature_img.shape[2]):
                    hog_features.append(get_hog_features(feature_img[:,:,channel],
                                        orient, pix_per_cell, cell_per_block,
                                        vis=False, feature_vec=True))
                hog_features = np.ravel(hog_features)
            else:
                hog_features = get_hog_features(feature_img[:,:,hog_channel], orient,
                            pix_per_cell, cell_per_block, vis=False, feature_vec=True)
            # Append the new feature vector to the features list
            file_features.append(hog_features)
        features.append(np.concatenate(file_features))
    # Return list of feature vectors
    return features
In [12]:
color_space = 'YUV' # can be RGB, HSV, LUV, HLS, YUV, YCrCb
orient = 9  # HOG orientations
pix_per_cell = 8 # HOG pixels per cell
cell_per_block = 2 # HOG cells per block
hog_channel = "ALL" # can be 0, 1, 2, or "ALL"
hog_clahe = False # use the CLAHE for calculating the HOG
spatial_size = (16, 16) # spatial binning dimensions
hist_bins = 16    # number of histogram bins
spatial_feat = True # spatial features on or off
hist_feat = True # histogram features on or off
hog_feat = True # HOG features on or off

car_features = extract_features(cars, color_space=color_space,
                        spatial_size=spatial_size, hist_bins=hist_bins,
                        orient=orient, pix_per_cell=pix_per_cell,
                        cell_per_block=cell_per_block,
                        hog_channel=hog_channel, hog_clahe=hog_clahe, spatial_feat=spatial_feat,
                        hist_feat=hist_feat, hog_feat=hog_feat)
notcar_features = extract_features(notcars, color_space=color_space,
                        spatial_size=spatial_size, hist_bins=hist_bins,
                        orient=orient, pix_per_cell=pix_per_cell,
                        cell_per_block=cell_per_block,
                        hog_channel=hog_channel, hog_clahe=hog_clahe, spatial_feat=spatial_feat,
                        hist_feat=hist_feat, hog_feat=hog_feat)

2. Prepare the data

Prepare the data for training the classifier. Data needs to be normalized, shuffled and split into training and testing data.

In [13]:
def prepare_data(true_features, false_features):
    
    """Helper function to prepare the data. This function will normalize the inputs and create the appropriate labels.
    
    Parameters:
        • true_features - features associated with a positive response
        • false_features - features associated with a negative response
        
    Returns:
        A tuple of input features (X), associated labels (y), and the scaler used"""
    
    # create an array stack of feature vectors
    X = np.vstack((true_features, false_features)).astype(np.float64)
    
    # fit a per-column scaler
    scaler = StandardScaler().fit(X)
    
    # apply the scaler to X
    scaled_X = scaler.transform(X)
    
    # define the labels vector
    y = np.hstack((np.ones(len(true_features)), np.zeros(len(false_features))))
    
    return scaled_X, y, scaler
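`StandardScaler` fits a per-column mean and standard deviation, then transforms with `(X - mean) / std`. The equivalent NumPy arithmetic, on toy data with wildly different feature ranges:

```python
import numpy as np

# toy feature matrix: 4 samples, 3 features with very different scales
X = np.array([[1.0, 100.0, 0.001],
              [2.0, 200.0, 0.002],
              [3.0, 300.0, 0.003],
              [4.0, 400.0, 0.004]])

# StandardScaler's transform is (X - column mean) / column std
mean = X.mean(axis=0)
std = X.std(axis=0)
scaled = (X - mean) / std
# every column now has zero mean and unit variance
```

Without this step, the large-magnitude color histogram counts would dominate the small-magnitude HOG values in the SVM's decision function.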
In [14]:
def split_data(X, y, test_size=0.2, random_state=None):
    
    """Helper function to split data into training and testing data
    
    Parameters:
        • X - input features matrix
        • y - associated labels
        • test_size - percentage of the data to be used for testing [0.0, 1.0]
        • random_state - number to determine the random state of the algorithm. Useful for consistent testing.
        
    Returns:
        A tuple of X training, X testing, y training, y testing split appropriately."""
    
    # select the random state to initialize
    if random_state is None:
        random_state = np.random.randint(0, 100)

    # split up data into randomized training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, random_state=random_state)
    
    return X_train, X_test, y_train, y_test
In [15]:
# create the input feature matrix and the labels for the data
X, y, scaler = prepare_data(car_features, notcar_features)

# split the data into training and testing
X_train, X_test, y_train, y_test = split_data(X, y)

3. Train a classifier

In [16]:
def train_classifier(X, y, classifier='LinearSVM'):
    
    """Create and train a classifier.
    
    Parameters:
        • X - input feature matrix
        • y - labels
        • classifier - string indicating the desired classifier type
        
    Returns:
        A trained classifier"""
    
    # select the classifier
    if classifier == 'LinearSVM':
        c = LinearSVC()
    else:
        print("WARNING: {} is not currently supported, defaulting to 'LinearSVM'".format(classifier))
        c = LinearSVC()
      
    # train the classifier
    c.fit(X, y)
    
    return c
In [17]:
# create a linear SVM
svm = train_classifier(X_train, y_train, classifier='LinearSVM')

4. Test the classifier

In [18]:
def test_classifier(c, X, y):
    
    """Test the classifier on the inputs provided
    
    Parameter:
        • c - trained classifier
        • X - input feature matrix
        • y - labels
        
    Returns:
        A float representing the accuracy of the model on this test data."""
    
    return c.score(X, y)
In [19]:
# test the classifier
accuracy = test_classifier(svm, X_test, y_test)

print('Classifier accuracy: {:.2f}%'.format(accuracy * 100))
Classifier accuracy: 99.07%

5. Save the classifier

In [20]:
def save_classifier(params, name='classifier'):
    
    """Helper function to save the classifier
    
    Parameters:
        • params - a dict where the values are the classifier and all the parameters used to generate it
        • name - name of the classifier to be used in the file name"""
    
    with open('model/{}.p'.format(name), 'wb') as f:
        pickle.dump(params, f)    
In [21]:
params = {'classifier': svm,
    'scaler': scaler,
    'color_space': color_space,
    'orient': orient,
    'pix_per_cell': pix_per_cell,
    'cell_per_block': cell_per_block,
    'hog_channel': hog_channel,
    'hog_clahe': hog_clahe,
    'spatial_size': spatial_size,
    'hist_bins': hist_bins,
    'spatial_feat': spatial_feat,
    'hist_feat': hist_feat,
    'hog_feat': hog_feat,
}

save_classifier(params, 'LinearSVM_Full')

Vehicle detection

0. Load the classifier

In [22]:
def load_classifier(name='classifier'):
    
    """Helper function to load a classifier
    
    Parameters:
        • name - name of the classifier file without the extension or path
        
    Returns:
        A dict where the values are the classifier and all the parameters used to generate it
        loaded from the file model/`name`.p"""
    
    with open('model/{}.p'.format(name), 'rb') as f:
        params = pickle.load(f)
        
    return params
In [23]:
# load the classifier
params = load_classifier('LinearSVM_Full')

svm = params['classifier']
scaler = params['scaler']
color_space = params['color_space']
orient = params['orient']
pix_per_cell = params['pix_per_cell']
cell_per_block = params['cell_per_block']
hog_channel = params['hog_channel']
hog_clahe = params['hog_clahe']
spatial_size = params['spatial_size']
hist_bins = params['hist_bins']
spatial_feat = params['spatial_feat']
hist_feat = params['hist_feat']
hog_feat = params['hog_feat']

1. Search for matches

In [24]:
def draw_boxes(img, boxes, color=(0, 0, 255), thickness=6):
    
    """Helper function to draw boxes on an image.
    
    Parameters:
        • img - input image
        • boxes - list of bounding boxes ((x_min, y_min), (x_max, y_max))
        • color - a tuple of RGB values [0, 255]
        • thickness - thickness of the lines to draw
        
    Returns:
        The image with the boxes drawn on top of it"""
    
    draw_img = np.copy(img)
    
    for box in boxes:
        point_min, point_max = box
        
        cv2.rectangle(draw_img, point_min, point_max, color, thickness)
        
    return draw_img
In [25]:
def potential_cars(img, y_min, y_max, scale, svc, X_scaler, to_color_space, from_color_space, orient, pix_per_cell, 
                   cell_per_block, hog_channel, hog_clahe, spatial_size, hist_bins):
    
    """Using a sliding window determine areas of the image representing potential cars using a classifier and some 
    parameters to generate the features to feed into it.
    
    Parameters:
        • img - input image
        • y_min - minimum y value in image to consider
        • y_max - maximum y value in image to consider
        • scale - scale of the sliding window to use
        • svc - trained classifier used to predict matches
        • X_scaler - feature scaler used on the training features for the classifier
        • to_color_space - desired color space to extract features
        • from_color_space - original color space of the image
        • orient - number of orientations for HOG features
        • pix_per_cell - cell size over which each gradient histogram is computed
        • cell_per_block - specifies the local area over which the histogram counts in a given cell will be normalized
        • hog_channel - image channel to apply the Histogram of Oriented Gradient (HOG)
        • hog_clahe - boolean to use the Contrast Limited Adaptive Histogram Equalization for the HOG input
        • spatial_size - size for spatial binning of color features
        • hist_bins - number of bins for the color histogram features
    
    Returns:
        A list of windows that matched a car"""
    
    # define the region of the image to search
    search_img = img[y_min:y_max, :, :]
    
    # convert to the desired color space
    search_img = convert_color(search_img, to_color_space, from_color_space)
    
    # if the scale is not 1, resize the search image (instead of the sliding window)
    if scale != 1:
        height, width, channels = search_img.shape
        search_img = cv2.resize(search_img, (np.int(width / scale), np.int(height / scale)))
    
    height, width, channels = search_img.shape

    # Define blocks and steps as above
    nxblocks = (width // pix_per_cell) - 1
    nyblocks = (height // pix_per_cell) - 1 
    nfeat_per_block = orient * cell_per_block ** 2

    # 64 was the original sampling rate, with 8 cells and 8 pix per cell
    window = 64
    nblocks_per_window = (window // pix_per_cell)-1 
    cells_per_step = 2  # Instead of overlap, define how many cells to step
    nxsteps = (nxblocks - nblocks_per_window) // cells_per_step
    nysteps = (nyblocks - nblocks_per_window) // cells_per_step
    
    # Compute individual channel HOG features for the entire image
    hogs = []
    
    hog_img = np.copy(search_img)

    if hog_clahe:
        hog_img[:, :, 0] = CLAHE(hog_img, from_color_space)
        hog_channel = hog_channel if hog_channel == 'ALL' else 0
        plt.figure()
        plt.imshow(convert_color(hog_img, 'RGB', 'YCrCb'))
        
    if hog_channel == 'ALL':
        for c in range(3):
            hogs.append(get_hog_features(hog_img[:, :, c], orient, pix_per_cell, cell_per_block, 
                                         feature_vec=False))
    else:
        hogs.append(get_hog_features(hog_img[:, :, hog_channel], orient, pix_per_cell, cell_per_block,
                                     feature_vec=False))
    
    bounding_boxes = []
    
    for xb in range(nxsteps):
        for yb in range(nysteps):
            ypos = yb * cells_per_step
            xpos = xb * cells_per_step
            # Extract HOG for this patch
            hog_feats = [h[ypos:ypos + nblocks_per_window, xpos:xpos + nblocks_per_window].ravel() for h in hogs]
            hog_features = np.hstack(hog_feats)

            xleft = xpos*pix_per_cell
            ytop = ypos*pix_per_cell

            # Extract the image patch
            subimg = cv2.resize(search_img[ytop:ytop + window, xleft:xleft + window], (64,64))
          
            # Get color features
            spatial_features = bin_spatial(subimg, size=spatial_size)
            hist_features = color_hist(subimg, nbins=hist_bins)

            # Scale features and make a prediction
            test_features = X_scaler.transform(np.hstack((spatial_features, hist_features, hog_features))
                                               .reshape(1, -1))
            test_prediction = svc.predict(test_features)
            
            if test_prediction == 1:
                xbox_left = np.int(xleft * scale)
                ytop_draw = np.int(ytop * scale)
                win_draw = np.int(window * scale)
                bounding_boxes.append(((xbox_left, ytop_draw + y_min), 
                                       (xbox_left + win_draw, ytop_draw + win_draw + y_min)))
                
    return bounding_boxes
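The cell-stepping arithmetic above can be checked in isolation. The strip dimensions below are illustrative (a 1280-px-wide frame cropped to the 400-656 band at scale 1.0):

```python
pix_per_cell, window, cells_per_step = 8, 64, 2
width, height = 1280, 256  # search strip at scale 1.0

nxblocks = width // pix_per_cell - 1             # HOG blocks across the strip
nblocks_per_window = window // pix_per_cell - 1  # blocks spanned by one 64-px window
nxsteps = (nxblocks - nblocks_per_window) // cells_per_step

# each step slides the window by cells_per_step * pix_per_cell pixels
step_px = cells_per_step * pix_per_cell
```

Stepping in whole cells lets the window reuse the HOG blocks computed once for the entire strip, instead of recomputing HOG per window.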
In [26]:
y_min = 400
y_max = 656
scales = [1.0, 1.5]

test_imgs = glob.glob('test_images/*.jpg')

all_bboxes = []

for test_img in test_imgs:
    img = cv2.imread(test_img)
    
    bboxes = []
    for scale in scales:
        scaled_bboxes = potential_cars(img, y_min, y_max, scale, svm, scaler, color_space, 'BGR', orient, 
                                       pix_per_cell, cell_per_block, hog_channel, hog_clahe, spatial_size, hist_bins) 
        bboxes.extend(scaled_bboxes)
        
    all_bboxes.append(bboxes)
    detected_img = draw_boxes(convert_color(img, 'RGB'), bboxes)
    plt.figure()
    plt.imshow(detected_img);

2. Convert to a heatmap

In [27]:
def labels_to_bboxes(labels):
    
    """Helper function to convert labels from a heatmap into smallest fitting bounding boxes
    
    Parameters:
        • labels - labels as returned by the scipy.ndimage.measurements.label function
        
    Returns:
        A list of bounding boxes"""
    
    bboxes = []
    
    for label in range(1, labels[1] + 1):

        # find pixels with each car_number label value
        nonzero = (labels[0] == label).nonzero()
        
        # identify x and y values of those pixels
        nonzeroy = np.array(nonzero[0])
        nonzerox = np.array(nonzero[1])
        
        # define a bounding box based on min/max x and y
        bbox = ((np.min(nonzerox), np.min(nonzeroy)), (np.max(nonzerox), np.max(nonzeroy)))
        
        # add to the list of bounding boxes
        bboxes.append(bbox)
        
    return bboxes
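The min/max extraction above is easy to verify on a hand-built labeled array (standing in for the `(array, count)` tuple that `scipy.ndimage.measurements.label` returns):

```python
import numpy as np

# hand-built stand-in for label() output: two disjoint blobs
labeled = np.zeros((8, 8), dtype=int)
labeled[1:4, 1:4] = 1   # blob 1: rows 1-3, cols 1-3
labeled[5:7, 5:8] = 2   # blob 2: rows 5-6, cols 5-7

bboxes = []
for lab in range(1, 3):
    ys, xs = (labeled == lab).nonzero()
    bboxes.append(((xs.min(), ys.min()), (xs.max(), ys.max())))
# → [((1, 1), (3, 3)), ((5, 5), (7, 6))]
```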
In [28]:
def heatmapify(img, bboxes, threshold=None):
    
    """Create a heatmap in the image based on a list of bounding boxes
    
    Parameters:
        • img - input image
        • bboxes - list of bounding boxes of potential matches
        • threshold - threshold to filter out false positives (integer or `None`)
        
    Returns:
        A tuple of the heatmap image and a list of bounding boxes of hot areas"""
    
    # create an empty heatmap
    heatmap = np.zeros_like(img[:, :, 0]).astype(np.float)
    
    # iterate through boxes
    for bbox in bboxes:
        
        # extract minimum and maximum points of the bounding boxes
        point_min, point_max = bbox
        
        # increment pixels in the heatmap for all pixels within the bounding box
        heatmap[point_min[1]:point_max[1], point_min[0]:point_max[0]] += 1
    
    # if there is a threshold, apply it
    if threshold is not None:
        heatmap[heatmap <= threshold] = 0
    
    # clip the heat map for visualizing purposes
    heatmap = np.clip(heatmap, 0, 255)
    
    # create a list of boxes for remaining hot areas
    labels = label(heatmap)
    final_bboxes = labels_to_bboxes(labels)
    
    return heatmap, final_bboxes
In [29]:
for test_img, bboxes in zip(test_imgs, all_bboxes):
    img = cv2.imread(test_img)
    heatmap, final_bboxes = heatmapify(img, bboxes, threshold=1)
    detected_img = draw_boxes(convert_color(img, 'RGB'), final_bboxes)

    f, (ax1, ax2) = plt.subplots(1, 2, figsize=(20,10))
    ax1.imshow(heatmap, cmap='hot')
    ax2.imshow(detected_img)

Pipeline

Put all the vehicle detection pieces together into a single function

In [30]:
def find_cars(img, classifier, scaler, params={}):
    
    """Find the cars in an image and overlay bounding boxes on them.
    
    Parameters:
        • img - input image to search
        • classifier - classifier to use to search for cars
        • scaler - scaler used on the feature vector
        • params - dictionary of parameters to control the search (see `potential_cars` and `heatmapify` funcs)
        
    Returns:
        An image with the cars marked by boxes"""
    
    
    # extract parameters from the params dictionary or use default values
    to_color_space = params.get('to_color_space', 'RGB')
    from_color_space = params.get('from_color_space', 'RGB')
    y_min = params.get('y_min', 400)
    y_max = params.get('y_max', 656)
    scales = params.get('scales', [1.5])
    orient = params.get('orient', 9)
    pix_per_cell = params.get('pix_per_cell', 8)
    cell_per_block = params.get('cell_per_block', 2)
    hog_channel = params.get('hog_channel', 0)
    hog_clahe = params.get('hog_clahe', True)
    spatial_size = params.get('spatial_size', (32, 32))
    hist_bins = params.get('hist_bins', 32)
    spatial_feat = params.get('spatial_feat', True)
    hist_feat = params.get('hist_feat', True)
    hog_feat = params.get('hog_feat', True)
    threshold = params.get('threshold', 1)
    prev_frame_cnt = params.get('prev_frame_cnt', 0)
    prev_frames_heat = params.get('prev_frames_heat', [])
    
    # get the bounding boxes of matches at all scales
    bboxes = []
    for scale in scales:
        scaled_bboxes = potential_cars(img, y_min, y_max, scale, classifier, scaler, to_color_space, from_color_space, 
                                       orient, pix_per_cell, cell_per_block, hog_channel, hog_clahe, spatial_size, 
                                       hist_bins) 
        bboxes.extend(scaled_bboxes)

    # take previous frames into account by merging their detected bounding boxes into the inputs for the heatmap
    if prev_frame_cnt > 0:
        # threshold should increase linearly with the number of previous frames being added to the heatmap
        threshold += len(prev_frames_heat)
        
        # add this frame's bounding boxes to the list
        prev_frames_heat.append(bboxes)
        
        # combine all bounding boxes into one list (instead of a list of lists)
        bboxes = list(itertools.chain.from_iterable(prev_frames_heat))
        
        # if we hit the limit of frames saved, remove the first one (FIFO)
        if len(prev_frames_heat) > prev_frame_cnt:
            prev_frames_heat = prev_frames_heat[1:]
            
        # reset the dictionary value for the previous frames heat
        params['prev_frames_heat'] = prev_frames_heat
        
    # create a heatmap based on the bounding boxes and threshold it
    heatmap, final_bboxes = heatmapify(img, bboxes, threshold=threshold)

    # create a copy of the image in RGB to draw on
    detected_img = convert_color(img, 'RGB', from_color_space)
    
    # draw the final bounding boxes on the image
    detected_img = draw_boxes(detected_img, final_bboxes)
    
    return detected_img

Test the pipeline

In [31]:
# load the classifier
params = load_classifier('LinearSVM_Full')

params['to_color_space'] = params['color_space']
params['from_color_space'] = 'BGR'
params['y_min'] = 400
params['y_max'] = 656
params['scales'] = [1.0, 1.5]
params['threshold'] = 1

# create list of test images
test_imgs = glob.glob('test_images/*.jpg')

# cycle through the test images
for test_img in test_imgs:
    
    # read in the image
    img = cv2.imread(test_img)
    
    # run the pipeline
    detected_img = find_cars(img, params['classifier'], params['scaler'], params)
    
    # display the results
    plt.figure()
    plt.imshow(detected_img);
    

Test on Videos

Let's try the pipeline on some video

In [32]:
# import everything needed to edit/save/watch video clips
from moviepy.editor import VideoFileClip
from IPython.display import HTML

0. Load the classifier and set up params

In [33]:
# load the classifier
params = load_classifier('LinearSVM_Full')

params['to_color_space'] = params['color_space']
params['from_color_space'] = 'RGB'
params['y_min'] = 400
params['y_max'] = 656
params['scales'] = [1.0, 1.5]
params['threshold'] = 1

classifier = params['classifier']
scaler = params['scaler']

1. test_video.mp4

In [34]:
# set parameters for the find_cars pipeline
params['to_color_space'] = 'YCrCb'
# use the previous 3 frames when deciding which matches are valid
params['prev_frame_cnt'] = 3
params['prev_frames_heat'] = []
params['threshold'] = 3

# run test video through the pipeline
test_output = 'test_output.mp4'
clip1 = VideoFileClip("test_video.mp4")
test_clip = clip1.fl_image(lambda x: find_cars(x, classifier, scaler, params)) #NOTE: this function expects color images!!
%time test_clip.write_videofile(test_output, audio=False)
[MoviePy] >>>> Building video test_output.mp4
[MoviePy] Writing video test_output.mp4
 97%|█████████▋| 38/39 [00:43<00:01,  1.16s/it]
[MoviePy] Done.
[MoviePy] >>>> Video ready: test_output.mp4 

CPU times: user 41.9 s, sys: 1.22 s, total: 43.1 s
Wall time: 44.9 s
In [35]:
HTML("""
<video width="960" height="540" controls>
  <source src="{0}">
</video>
""".format(test_output))
Out[35]:

2. project_video.mp4

In [36]:
# set parameters for the find_cars pipeline
params['to_color_space'] = 'YCrCb'
# use the previous 3 frames when deciding which matches are valid
params['prev_frame_cnt'] = 3
params['prev_frames_heat'] = []
params['threshold'] = 3

# run test video through the pipeline
project_output = 'project_output.mp4'
clip2 = VideoFileClip("project_video.mp4")
project_clip = clip2.fl_image(lambda x: find_cars(x, classifier, scaler, params)) #NOTE: this function expects color images!!
%time project_clip.write_videofile(project_output, audio=False)
[MoviePy] >>>> Building video project_output.mp4
[MoviePy] Writing video project_output.mp4
100%|█████████▉| 1260/1261 [23:16<00:01,  1.10s/it]
[MoviePy] Done.
[MoviePy] >>>> Video ready: project_output.mp4 

CPU times: user 22min 36s, sys: 32.9 s, total: 23min 9s
Wall time: 23min 17s
In [37]:
HTML("""
<video width="960" height="540" controls>
  <source src="{0}">
</video>
""".format(project_output))
Out[37]:
In [ ]: